Pearson and Spearman Methods

The Sensitivity Plots (Sensitivity Summary and R Squared Plot) allow a choice between calculating sensitivity metrics with either Pearson Correlation Coefficients or Spearman Rank Coefficients. This section explains what each method measures and the potential pitfalls of both.

Interpreting Correlation Coefficients

Values for correlation coefficients range between -1 and 1, with values farther away from 0 indicating a stronger correlation. Correlation values greater than 0 indicate a positive correlation, e.g. rain is positively correlated with wet sidewalks. Correlation values less than 0 indicate an inverse (negative) correlation, e.g. sunny days are inversely correlated with wet sidewalks.

Pearson Correlation Coefficients

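Pearson Correlation Coefficients measure how linear the relationship between two variables is. The Pearson Correlation Coefficient between two variables X and Y is defined as their covariance divided by the product of their standard deviations. The equation for this calculation is:

$$ r_{XY} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \, \sigma_Y} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2} \, \sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}} $$

where n is the number of runs and \bar{X} and \bar{Y} are the means of X and Y.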
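
As an illustration of the definition above, here is a minimal Python sketch that computes a Pearson coefficient from two lists of values. The function name `pearson` and the sample data are assumptions made for this example; this is not the implementation used by the Sensitivity Plots.

```python
import math


def pearson(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of deviations are sufficient.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: a perfectly linear relation and a monotonic but nonlinear one.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
print(pearson(x, [2 * v + 1 for v in x]))  # linear relation
print(pearson(x, [v ** 3 for v in x]))     # cubic relation: positive but below 1
```

The cubic case is strictly increasing, yet its Pearson coefficient falls short of 1 because the relation is not linear.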

Spearman Rank Coefficients

Spearman Rank Coefficients measure how monotonic the relationship between two variables is. In other words, the Spearman Rank Coefficient between two variables X and Y indicates whether increases in X are consistently accompanied by increases (or decreases) in Y, regardless of whether that relationship is linear, quadratic, or follows any other parametric form.

To calculate the Spearman Rank Coefficient we must first determine rankings for both variables under consideration. For each run, X and Y are each assigned a rank based on descending order, so the largest value receives rank 1. For example, given the data { 0.8, 0.6, 1.2, 5.7, 0.2 } the corresponding rankings would be { 3, 4, 2, 1, 5 }. When data points are tied, each receives the average of the ranks they would otherwise occupy. For example, given the data { 0.8, 0.6, 0.8, 5.7, 0.2 } the corresponding rankings would be { 2.5, 4, 2.5, 1, 5 }. For each run i we calculate the difference between the rank of X (x_i) and the rank of Y (y_i) as d_i = x_i - y_i. The equation for the Spearman Rank Coefficient is then:

$$ \rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n (n^2 - 1)} $$

where n is the number of runs.
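
The ranking and difference steps described above can be sketched in Python as follows. The function names `rank_descending` and `spearman_from_differences` are assumptions made for this example, and the d_i formula is the standard rank-difference form, which is exact only when there are no tied ranks; this is a sketch, not the implementation used by the Sensitivity Plots.

```python
def rank_descending(values):
    """Rank values in descending order (largest = 1); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of values tied with the one at `pos`.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Average of the 1-based positions pos+1 .. end+1.
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


def spearman_from_differences(xs, ys):
    """Spearman coefficient from squared rank differences d_i = x_i - y_i."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    rank_x = rank_descending(xs)
    rank_y = rank_descending(ys)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))


# The two ranking examples from the text.
print(rank_descending([0.8, 0.6, 1.2, 5.7, 0.2]))
print(rank_descending([0.8, 0.6, 0.8, 5.7, 0.2]))

# Hypothetical data: a cubic relation is monotonic, so every d_i is 0.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
print(spearman_from_differences(x, [v ** 3 for v in x]))
```

When many ties are present, a common alternative is to compute the Pearson coefficient of the two rank lists instead of using the d_i shortcut.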

Pitfalls of Pearson and Spearman

Both Pearson and Spearman can understate a real relationship in some situations. These situations occur when two variables are related but the relationship is not monotonic, for example when it oscillates, as with Y = sin(X). Depending on how the gathered data falls, either method might report a very low correlation or, when the increasing and decreasing parts of the trend balance each other, no correlation at all. For such data sets the Legacy Variable Influence Profiler is recommended.
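
To make the pitfall concrete, the following Python sketch evaluates both coefficients on Y = sin(X), once over ten full periods and once over a single half period where the rising and falling halves mirror each other. All names, grid sizes, and intervals here are assumptions chosen for illustration, and Spearman is computed as the Pearson coefficient of the ranks (a standard equivalent form that also handles ties).

```python
import math


def pearson(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def average_ranks(values):
    """Descending ranks (largest = 1); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for k in range(pos, end + 1):
            ranks[order[k]] = (pos + end) / 2 + 1
        pos = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman coefficient computed as Pearson on the rank lists."""
    return pearson(average_ranks(xs), average_ranks(ys))


def grid(start, stop, count):
    """Evenly spaced sample points from start to stop, inclusive."""
    step = (stop - start) / (count - 1)
    return [start + i * step for i in range(count)]


# Y depends entirely on X in both cases, yet neither relation is monotonic.
x_many = grid(0.0, 20 * math.pi, 2001)  # ten full periods of sin
x_half = grid(0.0, math.pi, 201)        # one symmetric rise and fall
cases = {"ten periods": x_many, "half period": x_half}

for label, xs in cases.items():
    ys = [math.sin(x) for x in xs]
    print(f"{label}: Pearson={pearson(xs, ys):+.3f}, Spearman={spearman(xs, ys):+.3f}")
```

Both coefficients come out close to zero in each case even though Y is fully determined by X, which is the behavior described above.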

See Also: Data Explorer